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Abstract 

This paper describes a new algorithm for hyperspectral image unmixing. Most of the unmixing 
algorithms proposed in the literature do not take into account the possible spatial correlations 
between the pixels. In this work, a Bayesian model is introduced to exploit these correlations. 
The image to be unmixed is assumed to be partitioned into regions (or classes) where the statistical 
properties of the abundance coefficients are homogeneous. A Markov random field is then proposed 
to model the spatial dependency of the pixels within any class. Conditionally upon a given class, each 
pixel is modeled by using the classical linear mixing model with additive white Gaussian noise. 
This strategy is investigated the well known linear mixing model. For this model, the posterior 
distributions of the unknown parameters and hyperparameters allow ones to infer the parameters of 
interest. These parameters include the abundances for each pixel, the means and variances of the 
abundances for each class, as well as a classification map indicating the classes of all pixels in the 
image. To overcome the complexity of the posterior distribution of interest, we consider Markov 
chain Monte Carlo methods that generate samples distributed according to the posterior of interest. 
The generated samples are then used for parameter and hyperparameter estimation. The accuracy 
of the proposed algorithms is illustrated on synthetic and real data. 

Index Terms 

Bayesian inference, Monte Carlo methods, spectral unmixing, hyperspectral images, Markov 
random fields, Potts-Markov model. 
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I. Introduction 

Since the early 90's, hyperspectral imagery has been receiving growing interests in various 
fields of applications. For example, hyperspectral images have been recently used successfully 
for mapping the timber species in tropical forestry [[T|. Hyperspectral image analysis involves 
many technical issues such as image classification, image segmentation, target detection 
and the crucial step of spectral unmixing. The problem of spectral unmixing has been 
investigated for several decades in both the signal processing and geoscience communities 
where many solutions have been proposed (see for instance [2] and f3\ and references 
therein). Hyperspectral unmixing consists of decomposing the measured pixel reflectances 
into mixtures of pure spectra whose fractions are referred to as abundances. Assuming the 
image pixels are linear combinations of pure materials is very common in the unmixing 
framework. More precisely, the linear mixing model (LMM) considers the spectrum of 
a mixed pixel as a linear combination of endmembers 0. The LMM requires to have 
known endmember signatures. These signatures can be obtained from a spectral library or by 
using an endmember extraction algorithm (EEA). Some standard EEAs are reviewed in 
Once the endmembers that appear in a given image have been identified, the corresponding 
abundances have to be estimated in a so-called inversion step. Due to obvious physical 
considerations, the abundances have to satisfy positivity and sum-to-one constraints. A lot of 
inversion algorithms respecting these constraints have been proposed in the literature. The 
fully constrained least squares (FCLS) [[5| and scaled gradient (SGA) [[6| algorithms are 
two optimization techniques that ensure the positivity and sum-to-one constraints inherent to 
the unmixing problem. Another interesting approach introduced in [7] consists of assigning 
appropriate prior distributions to the abundances and to solve the unmixing problem within a 
Bayesian framework. However, all these inversion strategies have been developed in a pixel- 
by-pixel context and, consequently, do not exploit the possible spatial correlations between the 
different pixels of the hyperspectral image. In this paper, we show that taking these spatial 
correlations into account allows one to improve the unmixing procedure. More precisely, 
the Bayesian algorithm initially developed in [|7| is modified to introduce spatial constraints 
between the abundance coefficients to be estimated. 

Within a Bayesian estimation framework, a very popular strategy for modeling spatial 
information in an image is based on Markov random fields (MRFs). MRFs have been widely 
used in the image processing literature to properly describe neighborhood dependance be- 
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tween image pixels. MRFs and their pseudo-likelihood approximations have been introduced 
by Besag in [8]. They have then been popularized by Geman in [9] by exploiting the Gibbs 
distribution inherent to MRFs. There are mainly two approaches that can be investigated to 
model spatial correlations between the abundances of an hyperspectral image with MRFs. 
The first idea is to define appropriate prior distributions for the abundances highlighting 
spatial correlations. This approach has been for instance adopted by Kent and Mardia in 
pO[ where several techniques have been introduced for mixed-pixel classification of remote 
sensing data. These techniques rely on a fuzzy membership process, which implicitly casts 
the achieved classification task as a standard unmixing problen^^ Modeling the abundance 
dependencies with MRFs makes this approach particularly well adapted to unmix images 
with smooth abundance transition throughout the scene. 

Conversely, this paper proposes to exploit the pixel correlations in an underlying mem- 
bership model. This standard alternative strategy allows more flexibility and appears more 
suited for images composed of distinct areas, as frequently encountered in remote sensing 
applications. Moreover, this approach has the great advantage of easily generalizing the 
Bayesian algorithms previously introduced in [|7|, pT| , as detailed further in the manuscript. 
It consists of introducing labels that are assigned to the pixels of the image. Then MRFs are 
not assigned on the abundances directly but on these hidden variables, leading to a softer 
classification. More precisely, to take into account the possible spatial correlations between the 
observed pixels, a Potts-Markov field p2| is chosen as prior for the labels. This distribution 
enforces the neighboring pixels to belong to the same class. Potts-Markov models have been 
extensively used for classification/segmentation of hyperspectral data in the remote sensing 
and image processing literatures p3|-p^. Note that other research works, such as fl9\ and 



1 20 1, have proposed alternative strategies of modeling spatial correlations between pixels for 
classification of hyperspectral images. All these works have shown that taking into account 
the spatial correlations is of real interest when analyzing hyperspectral images. 

This paper proposes to study the interest of using MRFs for unmixing hyperspectral images. 
More precisely, the Bayesian unmixing strategy developed in f7\ is generalized to take into 
account spatial correlations between the pixels of a hyperspectral image. The hyperspectral 
image to be analyzed is assumed to be partitioned into homogeneous regions (or classes) 

'Note that, to our knowledge, the Kent and Mardia's paper is one of the eariiest work explicitly dealing with linear 
unmixing of remotely sensed images. 
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in which the abundance vectors have the same first and second order statistical moments 
(means and covariances). This assumption implies an implicit image classification, modeled 
by hidden labels whose spatial dependencies follow a Potts-Markov field. Conditionally 
upon these labels, the abundance vectors are assigned appropriate prior distributions with 
unknown means and variances that depend on the pixel class. These prior distributions ensure 
the positivity and sum-to-one constraints of the abundance coefficients. They are based on 
a reparametrization of the abundance vectors and are much more flexible than the priors 
previously studied in ||7|, pT| or [11|. Of course, the accuracy of the abundance estimation 



procedure drastically depends on the hyperparameters associated to these priors. This paper 
proposes to estimate these hyperparameters in a fully unsupervised manner by introducing a 
second level of hierarchy in the Bayesian inference. Non-informative prior distributions are 
assigned to the hyperparameters. The unknown parameters (labels and abundance vectors) 
and hyperparameters (prior abundance mean and variance for each class) are then inferred 
from their joint posterior distribution. Since this posterior is too complex to derive closed- 
form expressions for the classical Bayesian estimators, Markov chain Monte Carlo (MCMC) 
techniques are studied to alleviate the numerical problems related to the LMM with spatial 
constraints. MCMC allow one to generate samples asymptotically distributed according to 
the joint posterior of interest. These samples are then used to approximate the Bayesian 
estimators, such as the minimum mean square error (MMSE) or the maximum a posteriori 
estimators. Note that the underlying classification and abundance estimation problems are 
jointly solved within this Bayesian framework. 

The paper is organized as follows. The unmixing problem associated to the LMM with 
spatial correlations is formulated in |llj Section III introduces a hierarchical Bayesian model 



appropriate to this unmixing problem. The MCMC algorithm required to approximate the 



Bayesian LMM estimators is described in Section IV Simulation results conducted on 
simulated and real data are provided in Sections |V] and |Vlj Finally, conclusions related 
to this work are reported in Section |VII 



II. Technical background and problem formulation 

A. Unmixing statistical model 

As highlighted in the previous section, the LMM has been mainly proposed in the remote 
sensing literature for spectral unmixing. The LMM assumes that the spectrum of a given 
pixel is a linear combination of deterministic endmembers corrupted by an additive noise [|2j 
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considered here as white Gaussian. More specifically, the observed L-spectrum of a given 
pixel p is defined as 

Up = Map + Up (1) 

where L is the number of spectral bands, M = [mi, . . . , mn] is a known L x R matrix 
containing the L-spectra of the endmembers, is the R x 1 abundance vector, R is the 
number of endmembers that are present in the image and Up is the noise vector. The vector 
Up is classically assumed to be an independent and identically distributed (i.i.d.) zero-mean 
Gaussian sequence with unknown variance 

np\s^ r^X{OL,s^lL) (2) 

where is the L x L identity matrix. Note that the noise is the same for all pixels of the 
hyperspectral image and does not vary from one pixel to another, which has been a common 



assumption widely admitted in the hyperspectral literature p2|-[24|. 

Considering an image of P pixels, standard matrix notations can be adopted leading to 
Y = [t/i, ...yp] and A = [ai, ...,ap]. 

B. Introducing spatial dependencies between abundances 

We propose in this paper to exploit some spatial correlations between the pixels of the 
hyperspectral image to be analyzed. More precisely, it is interesting to consider that the 
abundances of a given pixel are similar to the abundances of its neighboring pixels. Formally, 
the hyperspectral image is assumed to be partitioned into K regions or classes. Let C 
{1, . . . , P} denote the subset of pixel indexes belonging to the /cth class. A label vector of 
size P X 1 denoted as z = [zi, . . . , zpf'" with Zp G {1, . . . , K} is introduced to identify the 
class to which each pixel p belongs {p = 1, . . . , P). In other terms Zp = k if and only if 
p eX^. In each class, the abundance vectors to be estimated are assumed to share the same 
first and second order statistical moments, i.e., VA; G {1, . . . , K} , y{p,p') E Xk x Xk 

E [ap] = E [ttp/] = fi^ 



E 



E 



(3) 



Therefore, the kth class of the hyperspectral image to be unmixed is fully characterized by 
its abundance mean vector and the abundance covariance matrix. 
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C. Markov random fields 

To describe spatial correlations between pixels, it is important to properly define a neighbor- 
hood structure. The neighborhood relation between two pixels i and j has to be symmetric: 
if i is a neighbor of j then j is a neighbor of i. This relation is applied to the nearest 
neighbors of the considered pixel, for example the fourth, eighth or twelfth nearest pixels. 
Fig. [T] shows two examples of neighborhood structures. The four pixel structure or 1-order 
neighborhood will be considered in the rest of the paper. Therefore, the associated set of 
neighbors, or cliques, has only vertical and horizontal possible configurations (see [[8j, [|9| 
for more details). 




Fig. 1. 4-pixel (left) and 8-pixel (right) neighborhood structures. The considered pixel appear as a black circle whereas 
its neighbors are depicted in white. 



Once the neighborhood structure has been established, the MRF can be defined. Let Zp 
denote a random variable associated to the pth pixel of an image of P pixels. In the context 
of hyperspectral image unmixing, the variables zi, . . . ,zp indicate the pixel classes and take 
their values in a finite set {1, . . . , K} where K is the number of possible classes. The whole 
set of random variables {zi, . . . , zp} forms a random field. An MRF is then defined when 
the conditional distribution of Zi given the other pixels z.i only depend on its neighbors Zv(j)5 
i.e., 

f {zi\z.i) = f {zi\zv^i)) (4) 

where V{i) is the neighborhood structure considered and z.i = {zfj ^ %]. 

Since the pioneer work of Geman [9|, MRFs have been widely used in the image processing 



community as in [25|, [26|. The hyperspectral community has also recently exploited the 
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advantages of MRFs for hyperspectral image analysis [[T4|, |17|, [27|. However, to our 
knowledge, MRFs have not been studied for hyperspectral image unmixing. MRFs provide 
an efficient way of modeling correlations between pixels, which is adapted to the intrinsic 
properties of most images. Two specific MRFs are appropriate for image analysis: the Ising 
model for binary random variables and the Potts-Markov model that is a simple generalization 
to more-than-two variables p2| . This paper focuses on the Potts-Markov model since it is 



very appropriate to hyperspectral image segmentation [17|. Given a discrete random field z 



attached to an image with P pixels, the Hammersley-Clifford theorem yields 

p 



1 



exp 



p=i p'ev(p) 



(5) 



where /3 is the granularity coefficient, G'(/3) is the normalizing constant or partition function 



[28 1 and is the Kronecker function 

f 1, if X = 0, 

5(x) = 

I 0, otherwise. 

Note that drawing a label vector z = [zi,...,zp] from the distribution (|5]) can be easily 
achieved without knowing G{(3) by using a Gibbs sampler (the corresponding algorithmic 
scheme is summarized in [29 1). However, a major difficulty with the distribution (|5]) comes 
from the partition function that has no closed- form expression and depends on the unknown 
hyperparameter (3. The hyperparameter (3 tunes the degree of homogeneity of each region in 
the image. Some simulations have been conducted to show the influence of this parameter 
on image homogeneity. Synthetic images have been generated from a Potts-Markov model 
with K = 3 (corresponding to three gray levels in the image) and a 1 -order neighborhood 
structure. Fig. |2] indicates that a small value of (3 induces a noisy image with a large number of 
regions, contrary to a large value of (3 that leads to few and large homogeneous regions. It is 
unnecessary to consider values of /3 > 2 since for the 1 -order neighborhood structure adopted 
here, ''When (3 > 2, the Potts-Markov model is almost surely concentrated on single-color 
images" [30, p. 237]. Note however that for larger neighborhood systems, a smaller value 
of (3 would be enough to obtain uniform patches in Potts realizations since, for example, (3 



is expected to be about twice for an 2-order neighborhood structure [31 1. In this work, the 
granularity coefficient (3 will be fixed a priori. However, it is interesting to mention that the 



estimation of /3 might also be conducted by using the methods studied in [32|, [33 1 and [34|. 
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Fig. 2. Synthetic images generated from a Potts-Markov model with (from left to right) P = 0.8, 1.4, 2. 



D. Abundance Reparametrization 

As explained before, the fraction vectors ap should satisfy positivity and sum-to-one 
constraints defined as 

Or > 0, Vr = 1 i?, 

To ensure that these abundance constraints are satisfied, we have considered a reparametriza- 
tion for positive parameters summing to one that was introduced in [10] for the spectral 
unmixing of satellite images. Note that this reparametrization has also shown interesting 



(6) 



results for a pharmacokinetic problem [35| and has been recently applied to hyperspectral 



unmixing [36|. This reparametrization consists of rewriting the abundances as a function 



of random variables that will be referred to as logistic coefficients in the rest of the paper. 
A logistic coefficient vector tp = [tip. . . is assigned to each abundance vector ap, 

according to the relationship 



exp(t^,p) 



Er=i exp(t^,p) 



(7) 



Initially, the spatial dependencies resulting from the image partitioning described in Section 



II-B are based on the first and second order moments of the abundance vectors a„. However, 



the spatial constraints defined in ([3]) can be easily adapted when using logistic coefficient 
vectors. Indeed, in each class, the unknown logistic coefficient vectors are assumed to share 
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the same first and second order moments, i.e., VA; G {1, . . . , K} , y{p,p') EX^xXk 

x})^ = 'Ej\tp\zp = k\ = E \tp/ 1 Zpi = 



E 
E 



- V'fc) - V'fc) \zp = k 



(8) 



Witli this reparametrization, the A;th class is fully characterized by the unknown hyperparam- 
eters i/^fc and S^. 

III. Hierarchical Bayesian model 

This section investigates the likelihood and the priors inherent to the LMM for the spec- 
tral unmixing of hyperspectral images, based on Potts-Markov random fields and logistic 
coefficients. 



A. Unknown parameters 

The unknown parameter vector associated to to the LMM unmixing strategy is denoted as 

where is the noise variance, z is the label vector and T = [ti,...,tp] with tp = 
[ti p, . . . , t^p]^ (p = 1,...,P) is the logistic coefficient matrix used for the abundance 
reparametrization. Note that the noise variance has been assumed to be unknown in the 



present paper, contrary to the model considered in |10| 



B. Likelihood 



The additive white Gaussian noise sequence of the LMM allows one to write|^t/p|tp, 
J\f (Map{tp), s^Il) (p = 1, • • • , P)- Therefore the likelihood function of Up is 



/ [Vp \tp,s^) oc -^exp 



\yp-Map{tp)\\^ ' 
2s2 



(9) 



where oc means proportional to and ||a3|| = V x^x is the standard P' norm. By assuming 
independence between the noise sequences rip (p = 1, . . . , P), the likelihood of the P image 
pixels is 

p 

f{Y\T,s^)=\[f{yp\tp,s^). (10) 

p=i 

"Note that the dependence of the abundance vector Op on the logistic coefficient vector tp through ([7} has been 
explicitly mentioned by denoting ap — ap(tp). 
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C. Parameter priors 

This section defines the prior distributions of the unknown parameters and their associated 
hyperparameters that will be used for the LMM. The directed acyclic graph (DAG) for the 
parameter priors and hyperpriors for the considered model is represented in Fig. |3j 




Fig. 3. DAG for the parameter priors and iiyperpriors (ttie fixed parameters appear in dashed boxes) for the LMM. 



1) Label prior: The prior distribution for the label vector z = [zi, . . . ,zp\ introduced 
in paragraph |II-C is a Potts-Markov random field with a 1-order neighborhood and a known 
granularity coefficient (3 (fixed a priori). The resulting prior distribution can be written as in 
(|5]) where V{p) is the 1-order neighborhood depicted in Fig. [T](left). 



2) Logistic coefficient prior: Following the approach described in Section II-B each 
component of tp is assumed to be distributed according to a Gaussian distribution. In addition, 
as highlighted in |II-D| (see ([8])), the mean and variance of the logistic coefficients depend on 
the class to which the corresponding pixel belong. Therefore, the prior distribution for the 
tp is explicitly defined conditionally upon the pixel label 



tr,p I 



where the hyperparameters il)r,k and a^^^ depend on the associated pixel class k. As suggested 
in Section |I} a hierarchical Bayesian algorithm will be used to estimate these hyperparameters. 
For a given pixel p, by assuming prior independence between the coefficients ti p, . . . , t/jp, 
the prior distribution for the vector t = [ti p, . . . , tji^p]^ is 

/ {tp\zp = k, Sfc) ~ Af (iP,, Sfc) (12) 
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where tj^^ = [ipi^k, ■ ■ ■ , V'R.fc]^ and S^. = diag (0"^^^) is tho R x R diagonal matrix whose 
diagonal elements are cr^^. 

By assuming prior independence between the P vectors ti,...,tp, the full posterior 
distribution for the logistic coefficient matrix T is 

K 

k=i peXfe 

with * = [V^i, . . . , V'a'] and S = {Si, . . . , S^^}. 

3j Noise variance prior: A conjugate inverse-gamma distribution is assigned to the noise 
variance 

s>,5~xe;(z/,5) (14) 



where v and 5 are adjustable hyperparameters. This paper assumes z/ = 1 (as in [37| or 



1 38 1) and estimates 5 jointly with the other unknown parameters and hyperparameters (using 



a hierarchical Bayesian algorithm). 

D. Hyperparameter priors 

Hierarchical Bayesian algorithms require to define prior distributions for the hyperparam- 
eters. A particular attention has to be devoted to the hyperparameters tpr,k and since they 
fully describe the different classes partitioning the image. The prior distributions for ipr,k 
and cr^^ are conjugate distributions. More precisely, a vague inverse-gamma distribution is 
chosen for the logistic coefficient variance a^^, i.e., 

a,%|e,7~X6^(e,7) (15) 

where ^ and 7 have been tuned to ^ = 1 and 7 = 5 (in order to obtain a large variance). 
Moreover, a centered Gaussian distribution with unknown variance has been chosen as prior 
for the logistic coefficient mean 

iPr,k\v^ ~Af{0,v^) (16) 

where v'^ is another adjustable hyperparameter. By assuming independence between the 
different mean vectors i/^^, as well as between the covariance matrices Sfc for k = 1, . . . , K, 
the full priors for the two hyperparameters ^ and S can be expressed as 

/(*i«vnn(^)%xp(-§) (17) 

fc=lr=l ^ ^ ^ ^ 
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K R 



k=l r=l 



7 



(18) 



Jeffreys' priors are chosen for the hyperparameters 5 and v"^ (see, e.g., |39 p. 131] for details 
including computations) 



/((5)(x^1m+(5), /(i;2)oc^lM+(t;'). 



S' 



(19) 



where 1m+(-) denotes the indicator function defined on IR+. These choices, also adopted in 
p7| , [ [40| , reflect the lack of knowledge regarding these two hyperparameters. At this last 
hierarchy level within the Bayesian inference, the hyperparameter vector can be defined as 
n = {*,S,i;2,(5}. 



E. Joint distribution 

The joint posterior distribution of the unknown parameters and hyperparameters is classi- 
cally defined using the hierarchical structure 



f{@,n\Y) = f{Y\@)f{@\n)f{n). 



Straightforward computations yield the following posterior 

/ \ — P 



p=l 



2s2 



X exp 

X 



P 



E 



v) 



X 



n 



1 



r,k '^r,k 




(tj-,p '^r,k) 



(20) 



(21) 



with Uk = card(Xfc). The posterior distribution pT] ) associated to the LMM is too complex to 
obtain closed-from expressions for the MMSE or MAP estimators of the unknown parameter 
vector ©. To alleviate this problem, we propose to use MCMC methods to generate samples 



that are asymptotically distributed according to pT] ). The generated samples are then used to 
approximate the Bayesian estimators. The next section studies a hybrid Gibbs sampler that 



generates samples asymptotically distributed according to the posterior distribution (21) 



January 20, 2011 



DRAFT 



13 



IV. Hybrid Gibbs sampler 

This section studies a Metropolis-within-Gibbs sampler that generates samples according to 
the joint posterior /(©, The proposed sampler iteratively generates samples distributed 

according to the conditional distributions detailed below. 



A. Conditional distribution of the label vector z 

For each pixel p (p = 1, . . . ,P), the class label is a discrete random variable whose 
conditional distribution is fully characterized by the probabilities 



P = k\z.p, tp, V'fc, Sfe] oc fitp\zp = k, Sfc)/ I 



^p\^-p) 



(22) 



where k = 1, ...,K (K is the number of classes) and z.p denotes the vector z whose pth 
element has been removed. These posterior probabilities can be expressed as 

P [zp = k\z.p,tp,iJ)f„'Sk] 
p 



oc exp 



1 — 1/2 

X |Sfc| ' exp 



p=i p'ev(p) 

1 



(23) 



tp-xp,fi:-'itp-xp 



where |Sfe| = nf=i'^rfc- Note that the posterior probabilities of the label vector z in ( [23] ) 
define an MRF. Consequently, sampling from this conditional distribution can be achieved 
using the scheme detailed in [29], i.e., by drawing a discrete value in the finite set {1, ... , K} 



with the probabilities (23) 



B. Conditional distribution of logistic coefficient matrix T 
For each pixel p, the Bayes theorem yields 

/ {tp\zp = k, Efc, Vp) oc / {Vpltp, s^) / {tp\zp = k, T,^) . 

Straightforward computations lead to 

/ {tp\zp = k,'i/)f„'Ek,yp,s'^) 
1 



oc 



L 

1 ^ 2 



,2/ e^Pl 2s2ll-P 



y - Map{tp) 



X |Sfc| 2 exp 



{tp-i^,fi:-,'{tp-i: 



(24) 
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Unfortunately, it is too difficult to generate samples distributed according to < [24] ). Therefore, 
a Metropolis-Hastings step is used, based on a random walk method pT| p. 245] with a 
Gaussian distribution A/'(0, u^) as proposal distribution. The variance of the instrumental 
distribution has been fixed to obtain an acceptance rate between 0.15 and 0.5 as recommended 



in [42 1 



C. Conditional distributions of the noise variance 
The Bayes theorem yields 

p 

f {s'\Y,T,6) f {s'\6)l[f{y^\t„s'). 

p=i 

As a consequence, s'^\Y,T,5 is distributed according to the inverse-Gamma distribution 

f LP J^\\y„- MaJt„)P\ 
s^\Y,T,S~ig\^— + l,6 + J2 — 2 ) ■ ^^^^ 

D. Conditional distribution of ^ and S 

For each endmember r (r = 1, . . . , R) and each class k (k = 1, . . . , K), the conditional 
distribution of ^^./t can be written as 

/ (V'r,fc|2,*r,Crr,fc,t^^) 

OC / W f {tr,p\Zp = k, i)r,k, (^r,k) ■ (26) 

Similarly, the conditional distribution of is 

/ (0"r,fc|2> tr, llJr,k) OC / (cT^fc) JJ / (tr,pkp = k, lljr,k, CX^,fc) ■ (27) 

Straightforward computations allow one to obtain the following results 

ijr,k\z, tr, v'^Afi f'''^' , ) (28) 

al,\z,U^i;.,k-XG fy + 1,7 + E (29) 



with t,. z, = — V t 



■-r,p 
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E. Conditional distribution of and 8 

The conditional distributions of and b are the following inverse-gamma and gamma 
distributions, respectively 

V. Simulation results on synthetic data 



TABLE I 

Actual and estimated abundance mean and variance in each class. 





Actual values 


Estimated values 


Class 1 


Ml — E[ap, peij 


[0.6,0.3,0.1]^ 


[0.57,0.3,0.13]^ 


Var[ap^r. peij (xlO^^) 


[5,5,5]^ 


[5.6,6.7,6.7]'^ 


Class 2 


At2 = E[ap, peij 


[0.3,0.5,0.2]^ 


[0.29,0.49,0.2]^ 


Var[ap,r, peia] (xlO^^) 


[5,5,5]^ 


[4.5,5.2,8.1]^ 


Class 3 


A*3 = E[ap, PGI3] 


[0.3,0.2,0.5]^ 


[0.3,0.2,0.5]"^ 


Var[ap,r, peXj] (xlO^^) 


[5,5,5]^ 


[4.6,5.7,10.2]^ 



Many simulations have been conducted to illustrate the accuracy of the proposed algorithm. 
The first experiment considers a 25 x 25 synthetic image with K = ?> different classes. The 
image contains R = 3 mixed components (construction concrete, green grass and micaceous 
loam) whose spectra (L = 413 spectral bands) have been extracted from the spectral libraries 
distributed with the ENVI package [43 1. A label map shown in Fig. |4] (left) has been generated 



using ^ with (3 = 1.1. 

The mean and variance of the abundances have been chosen for each class as reported in 
Table |Ij These values reflect the fact that the 1st endmember is more present in Class 1 (with 
average concentration of 60%), the 2nd endmember is more present in Class 2 (with average 
concentration of 50%) and the 3rd endmember is more present in Class 3 (with average 
concentration of 50%). In this simulation scenario, the abundance variance has been fixed 
to a common value 0.005 for all endmembers, pixels and classes. The generated abundance 
maps for the LMM are depicted in Fig. [5} Note that a white (resp. black) pixel in the fraction 
map indicates a large (resp. small) value of the abundance coefficient. The noise variance 
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Fig. 4. Left: the actual label map. Right: the label map estimated by the LMM hybrid Gibbs sampler. 

is chosen such as the average signal-to-noise ratio (SNR) is equal to SNR = 19dB, i.e. 
= 0.001. 
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Fig. 5. Top: abundance maps of the 3 pure materials for LMM. Bottom: abundance maps of the 3 pure materials estimated 
by the hybrid Gibbs sampler (from left to right: construction concrete, green grass, micaceous loam). 

The MMSE and MAP estimators for the unknown parameters can be computed from 
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samples generated with the Gibbs samplers presented in Section |IVj For instance, the marginal 
MAP estimates of the label vector Zmap are depicted in Fig. |4] (right) for the proposed 
hybrid Gibbs algorithm. The MMSE estimates of the abundances conditioned upon Zmap are 
shown in Fig. [5] A number of A^mc = 5000 iterations (including 500 bum-in iterations) has 
been necessary to obtain these results. The proposed algorithm generates samples distributed 
according to the full posterior of interest. Then, these samples can be used to compute, 
for instance, the posterior distributions of the mean vectors /^^ = E [ap] (k = 1, . . . , K, 
p G Xfc). These mean vectors, introduced in (|3]), are of great interest since they characterize 
each class. Therefore, as an additional insight, the histograms of the abundance means fif. 
estimated by the proposed algorithm have been depicted in Fig. [6] for the 2nd class, i.e., 
k = 2. Similar results have been obtained for the other classes. They are omitted here for 
brevity. Finally, the estimated abundance means and variances have been reported in Table |l] 
(last row). The estimated classes, abundance coefficients and abundance mean vectors are 
clearly in accordance with their actual values. 




0.5 1 0.5 1 0.5 1 



Fig. 6. Histograms of tlie abundance means /Xj. = /ifc,2, Mfc.s]^ estimated by tiie proposed hybrid Gibbs algorithm 

for the 2nd class (k — 2). 



The LMM hybrid Gibbs algorithm is compared respectively with its non-spatial constrained 
Bayesian counterpart developed in [|7|. The synthetic image shown in Fig.|4]has been analyzed 
by the initial algorithm of [|7| with the same number of iterations Amc in addition with the 
FCLS algorithm. As a criterion, the global mean square error (MSE) of the rth estimated 
abundances have been computed for each algorithm. This global MSE is defined as 

MSEl = -Y.{ar,p - ar,pf (30) 
p=i 
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where aj,,p denotes the MMSE estimate of the abundance a^.p- Table |ll] reports the different 
results showing that the algorithm developed in this paper (referred to as "Spatial") performs 
better than the non-spatial constrained algorithms (referred to as "Bayesian" and "FCLS"). 



TABLE II 

Global MSEs of each abundance component. 





FCLS 


Bayesian 


Spatial 


MSE? 


0.0019 


0.0016 


3.1 X 10"'' 


MSEi 


4.3 X 10"" 


4.1 X lO"'' 


8.98 X 10"^ 


MSEi 


0.0014 


0.0013 


2.35 X 10"" 



VI. Simulation results on AVIRIS Images 

A. Performance of the proposed algorithm 

This section illustrates the performance of the proposed spatial algorithm on a real hyper- 
spectral dataset, acquired over Moffett Field (CA, USA) in 1997 by the JPL spectro-imager 
AVIRIS. Many previous works have used this image to illustrate and compare algorithm 



performance with hyperspectral images [44|, [45 1. The first region of interest, represented 



in Fig. |7} is a 50 X 50 image. The data set has been reduced from the original 224 bands 
to L = 189 bands by removing water absorption bands. As in [7], a principal component 
analysis has been conducted as a processing step to determine the number of endmembers 
present in the scene. Then, the endmembers spectra have been extracted with the help of the 



endmember extraction procedure N-FINDR proposed by Winter in [46|. The R = ?> extracted 
endmembers, shown in Fig. [sj corresponds to soil, vegetation and wateij^ The algorithm 
proposed in Section |W] has been applied on this image with A^mc = 5000 iterations (with 
500 bum-in iterations). The number of classes has been fixed io K = A since prior knowledge 
on the scene allows one to identify 4 areas in the image: water point, lake shore, vegetation 
and soil. 

The estimated classification and abundance maps for the proposed hybrid Gibbs algorithm 
are depicted in Fig. |9] (left) and [10] (top). The results provided by the algorithm are very 



^^Note that the influence of the endmember extraction step on the unmixing results has been investigated in 
coupling the proposed algorithm with other EEAs. 



29 
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Fig. 7. Real hyperspectral data: Moffett field acquired by AVIRIS in 1997 (left) and the region of interest shown in true 
colors (right). 
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Fig. 8. The i? = 3 endmember spectra obtained by the N-FINDR algorithm. 



similar and in good agreement with results obtained on this image with an LMM-based 
Bayesian algorithm ||7| (Fig. 10 middle) or with the well-known FCLS algorithm [|5| (Fig. 
TOj bottom). 

The performance of the proposed algorithm has been also evaluated for different values 
of the number of endmembers R. The resulting classification maps for R = A and R = 5 
are given in Fig. |9] (middle and right). These maps show that the classification results are 
quite robust with respect to the number of endmembers. The corresponding abundance maps 
can be found in [29], as well as the results of the proposed algorithm when the number of 
classes vary. 

The computational time of the proposed method (combined with the N-FINDR procedure) 
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Fig. 9. Label map estimated by the LMM-based proposed algorithm for i? = 3 (left), i? = 4 (middle) and R = 5 (right). 
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Fig. 10. Top: abundance maps estimated by the proposed algorithm (from left to right: vegetation, water and soil). Middle: 
abundance maps estimated by the LMM-based Bayesian algorithm (from |7)). Bottom: fraction maps estimated by the FCLS 
algorithm |5]. 



has been compared with the computational times of two other unmixing algorithms when 
applied on this Moffett image: the FCLS algorithm (also combined with N-FINDR) and 
the constrained nonnegative matrix factorization (cNMF) algorithm that jointly estimates 
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the endmember matrix and the abundances [47|. The result^ are reported in Table III The 
proposed method (referred to as "Spatial") has the higher computational cost when compared 
to the two others, mainly due to the joint estimation of the labels and the abundance vectors. 
However, it provides more information about unmixing. In particular, the samples generated 
with the proposed Gibbs sampler can be used to determine confidence intervals for the 
estimated parameters. 



TABLE III 

Computational times of LMM-based unmixing algorithms. 





FCLS 


cNMF 


Spatial 


Times (s.) 


0.388 


2.5 X 10^ 


8.4 X 10^ 



B. Simulation on a larger image 




Fig. 11. AVIRIS image of 190 x 250 pixels extracted from Cuprite scene observed in composite natural colors. 

'^These simulations have been carried out with an unoptimized MATLAB 2007b 32bit implementation on a 
Core(TM)2Duo 2.66GHz computer. 
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The performance of the proposed Bayesian algorithm has also been evaluated on a larger 
real hyperspectral image. The selected scene has been extracted from the AVIRIS Cuprite 
image, acquired over a mining site in Nevada, in 1997. The geologic characteristics of the 
complete data have been mapped in p8|, p9|. The area of interest of size 190 x 250 is 



represented in Fig. 11 and has been previously studied in [50] to test the VGA algorithm 
with R = 14. Therefore, in this experiment, the same number of endmembers has been 
extracted by the VGA algorithm. The number of classes has been set to K = 14, which 
seems to be a sufficient value to capture the natural diversity of the scene. The proposed 
algorithm has been used to estimate the abundance and label maps related to the analyzed 



scene. These maps are depicted in Fig. 12 and 14 respectively. 




Fig. 12. Classification map for the 190 x 250 Cuprite area (K = 14). 



The proposed Bayesian inversion algorithm has been able to identify some regions similar 



to those recovered in [50|. To illustrate, the composition of two particular areas (marked as 



colored rectangles in Fig. 12) is investigated. Tables IV report the abundance means for the 
most significant endmembers that appear in the two highlighted regions. From these tables, 
one can conclude that the two classes represented in black and dark gray of the "blue" 
area are composed of very mixed pixels (the abundance of the most significant endmember 
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is 0.201). On the other hand, both classes in the "green" area are clearly dominated by 
the 6th endmember. By comparing its corresponding signature with the materials included 
in the USGS library spectra, this 6th endmember matches the Montmorillonite spectrum 
(see Fig. \T3\ . This result is in good agreement with the ground truth. Indeed, from [49|, 
Montmorillonnite is the most commonly found material in this area. 




0.5 1 1.5 2 2.5 

Wavelength 



Fig. 13. Comparison of the 6th endmember spectrum extracted by the VCA algorithm (solid line) with the Montmorillonite 
signature extracted from the USGS spectral library (dashed line). 



TABLE rV 

Abundance means for the most significant endmembers in each highlighted region. 





Green area 


light gray 


white 


Endm. 1 


0.001 


0.225 


Endm. 3 


0.045 


0.000 


Endm. 5 


0.098 


0.027 


Endm. 6 


0.839 


0.528 





Blue area 


black 


dark gray 


Endm. 1 


0.135 


0.044 


Endm. 9 


0.155 


0.158 


Endm. 10 


0.159 


0.127 


Endm. 13 


0.187 


0.206 



VII. Conclusions 

A new hierarchical Bayesian algorithm was proposed for hyperspectral image unmixing. 
Markov random fields were introduced to model spatial correlations between the pixels of the 
image. A hidden discrete label was introduced for each pixel of the image to identify several 
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Fig. 14. Fraction maps of the 190 x 250 Cuprite area. 

classes defined by homogeneous abundances (with constant first and second order statistical 
moments). The positivity and sum-to-one constraints on the abundances were handled by 
using an appropriate reparametrization defined by logistic coefficient vectors. We derived the 
joint posterior distribution of the unknown parameters and hyperparameters associated to the 
proposed Bayesian linear mixing model. An MCMC method was then studied to generate 
samples asymptotically distributed according to this posterior. The generated samples were 
then used to estimate the abundance maps as well as the underlying image labels. The results 
obtained on simulated data and on real AVIRIS images are very promising. Future works 
include the estimation of the granularity coefficient involved in Potts-Markov random fields. 
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